Claremont
A signal separation view of classification
The problem of classification in machine learning has often been approached in terms of function approximation. In this paper, we propose an alternative approach for classification in arbitrary compact metric spaces which, in theory, yields both the number of classes, and a perfect classification using a minimal number of queried labels. Our approach uses localized trigonometric polynomial kernels initially developed for the point source signal separation problem in signal processing. Rather than point sources, we argue that the various classes come from different probability distributions. The localized kernel technique developed for separating point sources is then shown to separate the supports of these distributions. This is done in a hierarchical manner in our MASC algorithm to accommodate touching/overlapping class boundaries. We illustrate our theory on several simulated and real life datasets, including the Salinas and Indian Pines hyperspectral datasets and a document dataset.
Active Learning Classification from a Signal Separation Perspective
Mhaskar, Hrushikesh, O'Dowd, Ryan, Tsoukanis, Efstratios
In machine learning, classification is usually seen as a function approximation problem, where the goal is to learn a function that maps input features to class labels. In this paper, we propose a novel clustering and classification framework inspired by the principles of signal separation. This approach enables efficient identification of class supports, even in the presence of overlapping distributions. We validate our method on real-world hyperspectral datasets Salinas and Indian Pines. The experimental results demonstrate that our method is competitive with the state of the art active learning algorithms by using a very small subset of data set as training points.
SWA-LDM: Toward Stealthy Watermarks for Latent Diffusion Models
Yang, Zhonghao, Lyu, Linye, Chang, Xuanhang, He, Daojing, LI, YU
In the rapidly evolving landscape of image generation, Latent Diffusion Models (LDMs) have emerged as powerful tools, enabling the creation of highly realistic images. However, this advancement raises significant concerns regarding copyright infringement and the potential misuse of generated content. Current watermarking techniques employed in LDMs often embed constant signals to the generated images that compromise their stealthiness, making them vulnerable to detection by malicious attackers. In this paper, we introduce SWA-LDM, a novel approach that enhances watermarking by randomizing the embedding process, effectively eliminating detectable patterns while preserving image quality and robustness. Our proposed watermark presence attack reveals the inherent vulnerabilities of existing latent-based watermarking methods, demonstrating how easily these can be exposed. Through comprehensive experiments, we validate that SWA-LDM not only fortifies watermark stealthiness but also maintains competitive performance in watermark robustness and visual fidelity. This work represents a pivotal step towards securing LDM-generated images against unauthorized use, ensuring both copyright protection and content integrity in an era where digital image authenticity is paramount.
Understanding and Supporting Formal Email Exchange by Answering AI-Generated Questions
Miura, Yusuke, Yang, Chi-Lan, Kuribayashi, Masaki, Matsumoto, Keigo, Kuzuoka, Hideaki, Morishima, Shigeo
Replying to formal emails is time-consuming and cognitively demanding, as it requires crafting polite phrasing and providing an adequate response to the sender's demands. Although systems with Large Language Models (LLMs) were designed to simplify the email replying process, users still need to provide detailed prompts to obtain the expected output. Therefore, we proposed and evaluated an LLM-powered question-and-answer (QA)-based approach for users to reply to emails by answering a set of simple and short questions generated from the incoming email. We developed a prototype system, ResQ, and conducted controlled and field experiments with 12 and 8 participants. Our results demonstrated that the QA-based approach improves the efficiency of replying to emails and reduces workload while maintaining email quality, compared to a conventional prompt-based approach that requires users to craft appropriate prompts to obtain email drafts. We discuss how the QA-based approach influences the email reply process and interpersonal relationship dynamics, as well as the opportunities and challenges associated with using a QA-based approach in AI-mediated communication.
SPOCK 2.0: Update to the FeatureClassifier in the Stability of Planetary Orbital Configurations Klassifier
Thadhani, Elio, Ba, Yolanda, Rein, Hanno, Tamayo, Daniel
ABSTRACT The Stability of Planetary Orbital Configurations Klassifier (SPOCK) package collects machine learning models for predicting the stability and collisional evolution of compact planetary systems. In this paper we explore improvements to SPOCK's binary stability classifier (FeatureClassifier), which predicts orbital stability by collecting data over a short N-body integration of a system. We additionally discovered that 10% of N-body integrations in SPOCK's original training dataset were duplicated by accident, and that < 1% were misclassified as stable when they in fact led to ejections. We provide a cleaned dataset of 100,000+ unique integrations, release a newly trained stability classification model, and make minor updates to the API. INTRODUCTION clude systems that go unstable during the short integration phase; which slightly reduces the model AUC Determining orbital stability over planetary systems' from 0.9527 to 0.9490 (an AUC of 1 would be a perfect typical lifetimes of several Gyr through direct numerical model).